Geometric Foundations of Convex Optimization
MATH008 Lesson 8
00:00
Imagine navigating a complex landscape where your goal is to find the closest point to safety. In the language of optimization, this 'safety' is a set $C$, and the act of finding the closest point is a projection. While intuition suggests there is always a single 'closest' point, mathematical reality is more nuanced. The geometric foundation of convex optimization rests on formalizing 'proximity', moving beyond Euclidean intuition to rigorous functional representations like indicator and support functions.

1. Projections and Distances

The distance from a point $x_0$ to a set $C$ is defined as the infimum of all possible distances to points within the set:

$\text{dist}(x_0, C) = \inf\{\|x_0 - x\| \mid x \in C\}$

The specific optimizer that achieves this distance is the projection operator:

$P_C(x_0) = \text{argmin}\{\|x - x_0\| \mid x \in C\}$

For a simple hyperplane defined by $a^T x = b$, the projection has a beautiful closed-form solution: $P_C(x_0) = x_0 + (b - a^T x_0)a/\|a\|_2^2$. However, for general sets, this remains a constrained optimization problem: minimize $\|x - x_0\|$ subject to $f_i(x) \leq 0$ and $Ax = b$.
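The closed-form hyperplane projection can be checked numerically. A minimal sketch using NumPy; the specific vectors `a`, `b`, and `x0` are arbitrary illustrative choices:

```python
import numpy as np

# Hyperplane C = {x : a^T x = b}  (illustrative data)
a = np.array([1.0, 2.0, 2.0])
b = 3.0
x0 = np.array([4.0, 0.0, 1.0])

# Closed-form projection: P_C(x0) = x0 + (b - a^T x0) a / ||a||_2^2
p = x0 + (b - a @ x0) * a / np.linalg.norm(a) ** 2

# The projection lands exactly on the hyperplane ...
print(np.isclose(a @ p, b))  # True

# ... and achieves the known distance |b - a^T x0| / ||a||_2
dist = abs(b - a @ x0) / np.linalg.norm(a)
print(np.isclose(np.linalg.norm(x0 - p), dist))  # True
```

Note that the residual `x0 - p` is a scalar multiple of `a`, i.e., the displacement is orthogonal to the hyperplane, which is exactly the geometric picture behind the formula.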

2. Functional Geometry

To treat geometric constraints as objective components, we employ two powerful functional 'mirrors':

  • The Indicator Function $I_C(x)$: $I_C(x) = \begin{cases} 0 & x \in C \\ +\infty & x \notin C \end{cases}$. This collapses geometry into a numerical penalty.
  • The Support Function $S_C(x)$: $S_C(x) = \sup_{y \in C} x^T y$. This characterizes the set by its bounding hyperplanes in every direction.
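Both functional mirrors are easy to demonstrate on a polytope. For a polytope, the supremum in the support function is attained at a vertex, so $S_C$ reduces to a maximum over finitely many points. A minimal sketch with an illustrative vertex set (the membership test here only checks the vertices themselves; a robust test for the full hull would require a small LP):

```python
import numpy as np

# Vertices of C = conv{(1,0), (0,1), (-1,-1)}  (illustrative choice)
V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])

def support(x):
    # S_C(x) = sup_{y in C} x^T y; attained at a vertex for a polytope
    return np.max(V @ x)

def indicator(x, tol=1e-9):
    # I_C(x) = 0 if x in C, +inf otherwise.
    # Simplified: only the vertices are tested for membership here.
    in_C = any(np.allclose(x, v, atol=tol) for v in V)
    return 0.0 if in_C else np.inf

print(support(np.array([1.0, 0.0])))    # 1.0: extent of C in direction e1
print(indicator(np.array([0.0, 1.0])))  # 0.0
print(indicator(np.array([5.0, 5.0])))  # inf
```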

Theorem: Motzkin's Theorem

A nonempty, closed set $C \subseteq \mathbf{R}^n$ is a Chebyshev set (every point $x_0$ has a unique closest point in $C$) if and only if $C$ is convex.

Proof Sketch (Uniqueness)

Suppose $C$ is convex and the norm is strictly convex (the Euclidean norm, for instance). If there were two distinct closest points $u, v \in C$ with $\|u - x_0\| = \|v - x_0\| = d$, then their midpoint $w = (u+v)/2$ would lie in $C$ by convexity.

Since $u \neq v$, the vectors $u - x_0$ and $v - x_0$ are distinct vectors of equal length, so strict convexity of the norm gives a strict inequality: $\|w - x_0\| = \|\frac{1}{2}(u - x_0) + \frac{1}{2}(v - x_0)\| < \frac{1}{2}\|u - x_0\| + \frac{1}{2}\|v - x_0\| = d$.

This contradicts the assumption that $d$ was the minimum distance. Thus, the projection must be unique.
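The necessity of convexity shows up in even the simplest nonconvex example. A minimal one-dimensional sketch: the two-point set $\{-1, +1\}$ has two closest points to the origin, while its convex hull $[-1, 1]$ projects uniquely:

```python
import numpy as np

# Nonconvex set: the two-point set {-1, +1} on the real line.
# From x0 = 0, both points are at distance 1: argmin is not a singleton.
C_nonconvex = np.array([-1.0, 1.0])
x0 = 0.0
d = np.abs(C_nonconvex - x0)
minimizers = C_nonconvex[np.isclose(d, d.min())]
print(minimizers)  # [-1.  1.] -- two tied "closest" points

# Convex counterpart: the interval [-1, 1] (the convex hull of the set).
# Projection onto an interval is unique: simply clip x0 into it.
print(np.clip(x0, -1.0, 1.0))  # 0.0
```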

Caveat 8.4: Norm Dependency

We often construct a separating hyperplane using: $(P_C(x_0) - x_0)^T (x - \frac{1}{2}(x_0 + P_C(x_0))) = 0$. Beware! This specific construction is valid only for the Euclidean norm. General norms require more nuanced treatments of orthogonality.
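In the Euclidean case the construction above does work, and it can be verified numerically. A minimal sketch, assuming $C$ is the Euclidean unit disk and $x_0$ an illustrative exterior point; the hyperplane has normal $P_C(x_0) - x_0$ and passes through the midpoint of $x_0$ and its projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# C = Euclidean unit disk; x0 lies outside it (illustrative example)
x0 = np.array([3.0, 4.0])
p = x0 / np.linalg.norm(x0)   # Euclidean projection onto the disk
n = p - x0                     # hyperplane normal P_C(x0) - x0
m = 0.5 * (x0 + p)             # hyperplane passes through the midpoint

# x0 lies strictly on one side of the hyperplane ...
print(n @ (x0 - m) < 0)        # True

# ... and every point of C on the other side (sampled check)
pts = rng.uniform(-1, 1, size=(1000, 2))
pts = pts[np.linalg.norm(pts, axis=1) <= 1.0]  # keep points inside the disk
print(np.all((pts - m) @ n > 0))               # True
```

The sampled check is not a proof, but it illustrates why the midpoint hyperplane separates: the normal direction is exactly the Euclidean displacement from $x_0$ to its projection.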

🎯 Core Insight
Convexity is the 'glue' that ensures stability in geometric optimization. Without it, even the simple task of 'finding the closest point' can result in multiple, conflicting solutions, leading to algorithmic instability.